Bridging Neural Machine Translation and Bilingual Dictionaries

Authors

  • Jiajun Zhang
  • Chengqing Zong
Abstract

Neural Machine Translation (NMT) has become the new state of the art for several language pairs. However, how to integrate NMT with a bilingual dictionary that mainly contains words rarely or never seen in the bilingual training data remains a challenging problem. In this paper, we propose two methods to bridge NMT and bilingual dictionaries. The core idea is to design novel models that transform the bilingual dictionaries into adequate sentence pairs, so that NMT can distil latent bilingual mappings from the ample and repetitive phenomena. One method leverages a mixed word/character model, and the other synthesizes parallel sentences that guarantee massive occurrence of the translation lexicon. Extensive experiments demonstrate that the proposed methods remarkably improve translation quality, and that most rare words in the test sentences obtain correct translations if they are covered by the dictionary.
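
As a rough illustration of the mixed word/character idea, the following Python sketch keeps frequent words whole and splits rare words into characters, then applies the same segmentation to dictionary entries so that each entry can be treated as a short sentence pair. The frequency threshold, the `<c>` character marker, and all function names are assumptions made for illustration; the abstract does not specify these details, and this is not the authors' implementation.

```python
from collections import Counter


def build_vocab(sentences, min_count=2):
    """Return the set of whitespace-separated words seen at least min_count times.

    min_count is a hypothetical threshold chosen for this sketch.
    """
    counts = Counter(word for s in sentences for word in s.split())
    return {word for word, c in counts.items() if c >= min_count}


def mixed_segment(sentence, vocab):
    """Keep in-vocabulary words whole; split all other words into characters.

    Each character token is prefixed with "<c>" (a hypothetical marker) so
    that character units do not collide with ordinary word tokens.
    """
    tokens = []
    for word in sentence.split():
        if word in vocab:
            tokens.append(word)
        else:
            tokens.extend(f"<c>{ch}" for ch in word)
    return tokens


def dictionary_to_pairs(entries, src_vocab, tgt_vocab):
    """Turn (source, target) dictionary entries into segmented sentence pairs."""
    return [
        (mixed_segment(src, src_vocab), mixed_segment(tgt, tgt_vocab))
        for src, tgt in entries
    ]
```

Under these assumptions, a rare dictionary word never seen in the training data is still representable, because it decomposes into character tokens that the model has seen elsewhere.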


Similar resources

A Neural Network based Approach for English to Hindi Machine Translation

In this paper, we discuss the workings of our English-to-Hindi machine translation system. Our system can translate simple English sentences into Hindi. The system is implemented using a feed-forward backpropagation artificial neural network. The ANN model selects translation rules for grammar structure and Hindi words/tokens (such as verb, noun/pronoun etc...


Use of the Japio Technical Field Dictionaries and Commercial Rule-based Engine for NTCIR-PatentMT

Japio performs various patent-related translation businesses, and owns the original patent-document-derived bilingual technical term database (Japio Terminology Database) to be used by the translators. Currently the database contains more than 1,900,000 J-E bilingual technical terms. The Japio Technical Field Dictionaries (technical-field-oriented machine translation dictionaries) are created f...


Creating rich online dictionaries for the Lao-French language pair, reusable for Machine Translation

In this paper, we present how we generated two rich online bilingual dictionaries, Lao-French and French-Lao, from unstructured dictionaries in Microsoft Word files. We then briefly discuss the possible reuse of the lexical data for Machine Translation projects.


Exploiting Similarities among Languages for Machine Translation

Dictionaries and phrase tables are the basis of modern statistical machine translation systems. This paper develops a method that can automate the process of generating and extending dictionaries and phrase tables. Our method can translate missing word and phrase entries by learning language structures based on large monolingual data and mapping between languages from small bilingual data. It u...


Use of the Japio Technical Field Dictionaries for NTCIR-PatentMT

Japio performs various patent-related translation businesses, and owns the original patent-document-derived bilingual technical term database (Japio Terminology Database) to be used by the translators. Currently the database contains more than 1,000,000 J-E technical terms. The Japio Technical Field Dictionaries (technical-field-oriented machine translation dictionaries) are created from the Ja...



Journal:
  • CoRR

Volume: abs/1610.07272

Publication year: 2016